
Claudex — a third engine: Claude leads a team of parallel Codex workers#90

Open
mega123-art wants to merge 12 commits into main from feat/claudex-team-mode


@mega123-art

Contributor

What

Adds Claudex, a third engine alongside Claude and Codex. In Claudex (Team mode) a lead Claude session fans out 1–4 Codex workers in parallel via one MCP tool (spawn_codex_subagents), then merges their work. The parallelism is visible (war-room card), and the team result carries a trust label.

Claudex is an identity, not a separate binary: it runs the claude binary with fan-out forced on. engineBinary() normalizes claudex → claude at every binary-facing boundary (spawn, inject, memory, resume), while storage/UI/badges keep the claudex identity (own tab, own sessions, purple accent).
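
For readers skimming the PR, the identity/binary split could look roughly like this. It is a minimal sketch, not the code in this branch: the `Cli` union and `engineBinary()` are named in the PR, but the `Binary` type, `SessionRecord`, and `describeSession` are hypothetical names added for illustration.

```typescript
// Assumed shapes: the PR names `Cli` and `engineBinary()`; everything else
// here (Binary, SessionRecord, describeSession) is hypothetical.
type Cli = "claude" | "codex" | "claudex";
type Binary = "claude" | "codex";

/** Map an engine identity to the binary that actually runs it. */
function engineBinary(cli: Cli): Binary {
  // Claudex is an identity only, so it always resolves to the claude binary.
  return cli === "claudex" ? "claude" : cli;
}

interface SessionRecord {
  cli: Cli;       // identity: what gets stored, tabbed, and badged
  binary: Binary; // what gets spawned, injected into, or resumed
}

/** Keep the identity for storage/UI and derive the binary for process work. */
function describeSession(cli: Cli): SessionRecord {
  return { cli, binary: engineBinary(cli) };
}
```

The point of the split is that only binary-facing call sites ever see `claude`; anything that stores or renders a session keeps reading `cli`.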

Highlights

  • Engine, not a mode — three peer tabs: Claude · Codex · Claudex. Removed the old Team mode chip.
  • Auth-gated — the Claudex tab + onboarding card are dulled/locked until both Claude and Codex are signed in (claude leads, codex works). Updates live on login.
  • Parallel fan-out: spawn_codex_subagents(tasks[]) runs the tasks through Promise.all, with a clamp of 4 workers, a per-worker timeout, and a depth guard (workers never get the tool back).
  • Worker safety gates — coder workers can't run destructive/exfil commands or touch paths outside the project; researcher workers are read-only.
  • One plain-language approval before the team edits files; live marquee cues per worker.
  • Limit handling — usage/rate-limit errors (worker and lead engine) show a clean human message, not a raw stack.
  • UI — purple accent + a vector glyph (no emoji); war-room card with one tile per worker; long text wraps. Applied in both surfaces (VSCode inline webview + React/mobile webview, incl. the onboarding picker).
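
The fan-out bullet above can be sketched as follows. This is an illustration under assumptions, not the contents of codexSubagent.ts: `WorkerTask`, `WorkerResult`, `clampTasks`, `withTimeout`, and `fanOut` are hypothetical names, and the sketch assumes tasks beyond the clamp are dropped rather than rejected, which the PR does not specify.

```typescript
// Hypothetical types and names; only the clamp of 4, Promise.all, and the
// per-worker timeout come from the PR description.
interface WorkerTask { goal: string }
interface WorkerResult { goal: string; ok: boolean; output: string }

const MAX_WORKERS = 4;

/** Keep at most MAX_WORKERS tasks (assumption: extras are dropped). */
function clampTasks(tasks: WorkerTask[]): WorkerTask[] {
  return tasks.slice(0, MAX_WORKERS);
}

/** Reject if `work` has not settled within `ms` milliseconds. */
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(`timed out after ${ms} ms`)), ms);
    work.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (err) => { clearTimeout(timer); reject(err); },
    );
  });
}

/** Run the clamped tasks in parallel; one failing worker never rejects the batch. */
async function fanOut(
  tasks: WorkerTask[],
  runWorker: (task: WorkerTask) => Promise<string>,
  timeoutMs: number,
): Promise<WorkerResult[]> {
  return Promise.all(
    clampTasks(tasks).map(async (task): Promise<WorkerResult> => {
      try {
        const output = await withTimeout(runWorker(task), timeoutMs);
        return { goal: task.goal, ok: true, output };
      } catch (err) {
        const message = err instanceof Error ? err.message : String(err);
        return { goal: task.goal, ok: false, output: message };
      }
    }),
  );
}
```

Catching inside each mapped promise is a deliberate choice in this sketch: `Promise.all` rejects on the first failure, so converting errors into `ok: false` results lets the lead see every worker's outcome.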

Tests / build

  • codexSubagent.spec.ts — fan-out clamp, safety classifiers, limit detection (7 tests).
  • core tsc clean; VSCode + webview build clean on top of latest main (rebased).

Plan: plans/claudex-team-mode.md.

🤖 Generated with Claude Code

mega123-art and others added 12 commits June 30, 2026 17:00
Add a "Team mode" where a lead Claude session spawns 1–4 Codex worker
subagents in parallel via one MCP tool (spawn_codex_subagents), reusing
the existing spawnCli runtime — no new orchestrator. Workers are spawned
directly, so they never get the tool back (depth guard is automatic).

- codexSubagent.ts: runCodexTask/runCodexTasks + the SDK MCP tool. Two
  worker gates — read-only researcher (default) vs writing coder (when
  the session is in claudex mode). Clamp 4, 5min timeout, 8k output cap.
- runtime/index.ts: mode==="claudex" enables writing workers; tool wired
  Claude-only.
- spawn.ts: map "claudex" mode to acceptEdits so the lead's merge-edits
  auto-apply.
- convert/claude.ts: surface the fan-out as a "Claudex" war-room action.
- webview: 🧬 Team mode chip + one-line consent; war-room worker cards +
  trust badge.
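
As a rough illustration of the two worker gates, here is a sketch of a role-plus-path check. All names (`WorkerRole`, `WorkerAction`, `resolvePosix`, `isInsideProject`, `allowAction`) are hypothetical, the path handling is POSIX-only and ignores symlinks, and restricting researcher reads to the project is an assumption the PR does not state.

```typescript
// Hypothetical gate sketch. The PR only states: researchers are read-only,
// coders cannot touch paths outside the project.
type WorkerRole = "researcher" | "coder";
interface WorkerAction { kind: "read" | "write"; path: string }

/** Resolve `target` against `root`, collapsing "." and ".." (POSIX paths only). */
function resolvePosix(root: string, target: string): string {
  const start = target.startsWith("/") ? [] : root.split("/");
  const out: string[] = [];
  for (const part of [...start, ...target.split("/")]) {
    if (part === "" || part === ".") continue;
    if (part === "..") out.pop();
    else out.push(part);
  }
  return "/" + out.join("/");
}

/** True when `target` resolves to the project root or something beneath it. */
function isInsideProject(projectRoot: string, target: string): boolean {
  const root = resolvePosix("/", projectRoot);
  if (root === "/") return true;
  const resolved = resolvePosix(root, target);
  return resolved === root || resolved.startsWith(root + "/");
}

/** Researchers never write; every action must stay inside the project (assumed for reads too). */
function allowAction(role: WorkerRole, projectRoot: string, action: WorkerAction): boolean {
  if (role === "researcher" && action.kind === "write") return false;
  return isInsideProject(projectRoot, action.path);
}
```

Comparing against `root + "/"` rather than a bare prefix avoids treating a sibling such as `/repo-other` as being inside `/repo`.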

Plan: plans/claudex-team-mode.md.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Strengthen the spawn_codex_subagents description so Claude defaults to
splitting any 2+-part request into parallel workers itself, without the
user having to name them. Only skips fan-out for truly single-step tasks.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claudex is now its own engine tab alongside Claude and Codex, not a
permission mode. It's an IDENTITY (own tab, sessions, badge, purple
accent) that runs the claude binary with parallel-Codex fan-out forced
on — engineBinary() normalizes claudex->claude at every binary-facing
boundary (spawn, inject, memory, resume), while storage/UI keep the
claudex identity.

- contract: Cli union gains 'claudex' + engineBinary() helper
- runtime/spawn/session: route via engineBinary; claudex slot; team
  config (write workers + Task disabled) keyed on the engine
- webview (both surfaces): 3rd tab, purple accent on tab + typing box,
  claudex shares claude's model/mode catalogs, status/login map to claude
- mode picker no longer carries claudex (it's an engine now)
- UI de-emoji'd: war-room + tab use a purple vector glyph, not emoji

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- Claudex tab is dulled + unclickable until BOTH Claude and Codex are
  signed in (claude leads, codex does the work). Clicking while locked
  explains why. Refreshes live on cliStatus + login completion. Applied
  in both webview surfaces.
- Detect usage/rate/quota limit errors from workers (isLimitError) and
  surface them plainly: the worker output is tagged [LIMIT], the marquee
  shows a rate-limit cue, and the tool appends a note telling the lead to
  inform the user instead of silently retrying.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Lead Claude/Codex usage/rate-limit errors now render a human message
("<engine> hit a usage/rate limit — try again in a bit, or switch
engine") instead of a raw stack, reusing isLimitError. Raw text kept
appended for debugging.
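
The limit handling described in these two commits might be sketched like this. `isLimitError` is named in the PR, but the patterns it matches are not listed, so the regexes below are illustrative guesses; `tagWorkerOutput` and `humanLimitMessage` are hypothetical names, and the message wording paraphrases the one quoted above.

```typescript
// Illustrative patterns only: the PR does not say what isLimitError matches.
const LIMIT_PATTERNS: RegExp[] = [/rate.?limit/i, /usage limit/i, /quota/i, /\b429\b/];

/** True when the text looks like a usage, rate, or quota limit error. */
function isLimitError(text: string): boolean {
  return LIMIT_PATTERNS.some((pattern) => pattern.test(text));
}

/** Worker side: prefix limit failures with [LIMIT] so the lead can spot them. */
function tagWorkerOutput(output: string): string {
  return isLimitError(output) ? `[LIMIT] ${output}` : output;
}

/** Lead side: human message first, raw text appended for debugging. */
function humanLimitMessage(engine: string, raw: string): string {
  if (!isLimitError(raw)) return raw;
  return `${engine} hit a usage/rate limit. Try again in a bit, or switch engine.\n\n${raw}`;
}
```

Reusing one classifier for both the worker tag and the lead message keeps the two paths from drifting apart, which matches the "reusing isLimitError" note in the commit.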

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The pretty card-based PickEngine (mobile/React onboarding) now shows a
third Claudex card (purple + fan-out vector icon), locked + dulled until
both Claude and Codex are signed in — matching the composer tab gating.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Worker goals containing long unbreakable strings (file paths) overflowed
the tile and bled across the grid. Add overflow-wrap:anywhere + min-width:0
+ overflow:hidden on tiles in both surfaces (inline webview + React).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
On the claudex engine in Plan mode, the fan-out tool now dispatches
READ-ONLY Codex researchers (workers forced read-only when mode=plan),
so Claude can send a research team, gather their reports, then propose
the plan — without touching the repo.

- buildPassiveSpawn: claudexWrite = claudex && mode !== 'plan'
- canUseTool: always allow spawn_codex_subagents (incl. plan mode); the
  workers self-gate, so no extra approval in plan
- tool description: in read-only mode, tells Claude it's safe + encouraged
  to dispatch parallel researchers while planning, then write the plan
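
A small sketch of the gate selection described above. The expression `claudex && mode !== 'plan'` is taken from the commit message; the function name `workerGate`, the `WorkerGate` type, and treating `mode` as a plain string are assumptions.

```typescript
// Hypothetical helper around the commit's `claudexWrite` expression.
type Cli = "claude" | "codex" | "claudex";
type WorkerGate = "coder" | "researcher";

/** Decide whether fan-out workers may write, given the engine and session mode. */
function workerGate(cli: Cli, mode: string): WorkerGate {
  // Writing workers only on the claudex engine, and never while planning.
  const claudexWrite = cli === "claudex" && mode !== "plan";
  return claudexWrite ? "coder" : "researcher";
}
```

Under this sketch, `workerGate("claudex", "plan")` yields read-only researchers, which is why the tool itself can stay allowed in Plan mode without an extra approval.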

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@zo-sol

zo-sol commented Jul 1, 2026

Collaborator

This is genuinely great work, Parth. The multi-brain teamwork, Claude leading a team of Codex workers, is the most "AgentNet" idea in the whole product, and you built it cleanly (the engineBinary identity trick and the automatic depth guard are really nice). I fully want this to ship.

One idea on how to roll it out, building on what you have rather than changing it:

Ship Claudex as a discoverable easter egg first, with a clear test window, then promote it to a main feature.

Instead of adding a third engine tab right away, keep the surface as Claude / Codex, and make Claudex something you find: drag the Claude chip onto the Codex chip, a fusion effect plays, and that session becomes Claudex. It's eye-catching enough that people will notice it and talk about it, and because it only activates when both engines are signed in, it naturally scopes who's exercising it.

Two reasons I like this rollout:

  1. A clearer test period. Landing it as an easter egg first gives us a defined window to watch the write-capable team on real sessions (parallel writes, the one-approval flow, rate-limit handling) before it becomes a headline feature everyone leans on. It's the same code you wrote, just introduced with room to observe it.

  2. Easter egg to main feature is a stronger arc. Discoverable, then promoted, concentrates attention: people find it, share it, and when we flip it to first-class it already has a story and some buzz behind it. A third tab on day one spends all of that at once.

The reason underneath: we're a vibe-coding app, and Claude + Codex are just the starting pair. I want headroom to test more engines and combos, and introducing Claudex this way lets us trial the multi-engine direction with a real feedback loop first.

Everything in packages/core can stay exactly as you built it. The only surface change would be swapping the visible Claudex chip and onboarding card for the drag-to-fuse gesture on the two existing chips. Happy to take that part if it helps.

@zo-sol

zo-sol commented Jul 1, 2026

Collaborator

Follow-up: implementation spec for the "drag to fuse" easter egg

Sharing a concrete build plan so this is easy to pick up. It is purely additive UI on top of your PR. Nothing in packages/core changes, so the runtime, approval gate, depth guard, and disallowedTools all stay exactly as you built them.

Files touched (all in surfaces/webview, the shared React app)

  • src/chat/Composer.tsx - the an-term-seg engine strip. Render only ["claude", "codex"]; add the pointer-drag gesture, the Codex drop target, the fusion animation, and a fused "Claudex" pill with an un-fuse affordance.
  • src/onboarding/PickEngine.tsx - drop the Claudex card, back to Claude / Codex only.
  • src/index.css - fusion keyframes and drag styles (reusing the --claudex accent token this PR already added).
  • src/state/store.tsx - no reducer change needed. selectEngine("claudex") already maps to the claude binary and sets cli: "claudex", so we reuse it as-is. Only a tiny local component state for the in-progress drag.

Because the UI lives in the shared webview, this ships to VSCode, the Android app, and localhost from one change. surfaces/cli has no DOM, so Claudex stays reachable there only programmatically, which is fine for now.

Interaction spec

Use Pointer Events (pointerdown / pointermove / pointerup), not HTML5 drag-and-drop, because native DnD is unreliable in the Android WebView. Set touch-action: none on the Claude chip only while a drag is armed so it does not fight page swipe or the chip strip's own scroll.

  • Desktop (VSCode / browser): press and drag the Claude chip.
  • Mobile (Android): long-press (~250 ms) to pick up Claude, then drag. The threshold keeps a normal tap = select and a horizontal move = scroll.
  • While dragging: a lightweight clone of the Claude chip follows the pointer. The Codex chip becomes a drop target and glows --claudex only when both engines are signed in (state.cliReport?.claude === "ok" && state.cliReport?.codex === "ok", the same condition you already compute as locked, inverted).
  • Drop on Codex (hit-test the pointer against the Codex chip's bounding rect): play the fusion effect, then call selectEngine("claudex").
  • Drop anywhere else / released early: the clone snaps back, no-op.
  • Un-fuse: when state.cli === "claudex", the strip collapses to a single fused Claudex pill. Tap it (or long-press to split) to call selectEngine("claude") and return to the two-chip strip.
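
To make the interaction rules above concrete, here is a sketch of the decision logic as pure functions, kept separate from the DOM so it can be unit tested. Every name is hypothetical (`canFuse`, `isDragArmed`, `isInside`, `resolveDrop`); the `"ok"` check and the ~250 ms threshold come from the spec, and treating pen input like touch is an assumption.

```typescript
// Pure decision logic for the drag-to-fuse gesture (hypothetical names).
interface Point { x: number; y: number }
interface Rect { left: number; top: number; right: number; bottom: number }
interface CliReport { claude?: string; codex?: string }
type DropOutcome = "fuse" | "snap-back";

const LONG_PRESS_MS = 250;

/** Both engines must report "ok" before the Codex chip may glow or accept a drop. */
function canFuse(report: CliReport | undefined): boolean {
  return report?.claude === "ok" && report?.codex === "ok";
}

/** Mouse drags arm immediately; touch and pen need a long-press first. */
function isDragArmed(pointerType: "mouse" | "touch" | "pen", heldMs: number): boolean {
  return pointerType === "mouse" || heldMs >= LONG_PRESS_MS;
}

/** Hit-test a pointer position against a chip's bounding rect. */
function isInside(p: Point, r: Rect): boolean {
  return p.x >= r.left && p.x <= r.right && p.y >= r.top && p.y <= r.bottom;
}

/** Decide what a pointerup does: fuse only over Codex and only when both engines are ready. */
function resolveDrop(p: Point, codexRect: Rect, report: CliReport | undefined): DropOutcome {
  return canFuse(report) && isInside(p, codexRect) ? "fuse" : "snap-back";
}
```

In the component, a `pointerup` handler could pass `{ x: e.clientX, y: e.clientY }` and the Codex chip's `getBoundingClientRect()` (a `DOMRect` satisfies `Rect` structurally) to `resolveDrop`, then play the fusion effect and call `selectEngine("claudex")` on `"fuse"`.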

Discoverability (hidden, but eye-catching)

No label, tooltip, or onboarding copy advertises it. The single reveal is the Codex chip lighting up --claudex the instant you start dragging Claude, and only if both engines are ready. That is enough to be found by anyone who experiments, and invisible otherwise. If we want a fainter breadcrumb later, an occasional subtle magnet shimmer between the two chips is a one-line addition, but I would start with nothing.

The fusion effect

Short and physical, roughly 500 to 700 ms, pure CSS keyframes plus a transient fusing state, no library:

  1. The two chips ease toward each other and merge into one pill.
  2. A brief purple burst (radial --claudex glow / particle ring) fires at the merge point.
  3. The node-graph glyph from PickEngine (the three-circle vector) pops in on the fused pill, then settles into the steady Claudex label.

Respect prefers-reduced-motion: fall back to a simple crossfade into the fused pill with no burst.
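
If the choice between the burst and the crossfade is made in script rather than purely in CSS, it could be as small as the sketch below. `FusionAnimation`, `MediaMatcher`, and `pickFusionAnimation` are hypothetical names; the matcher is injected so the function can be tested outside a browser.

```typescript
// Hypothetical helper: choose the fusion animation from the user's motion preference.
type FusionAnimation = "burst" | "crossfade";
type MediaMatcher = (query: string) => { matches: boolean };

/** Use the plain crossfade when the user asks for reduced motion. */
function pickFusionAnimation(matchMedia: MediaMatcher): FusionAnimation {
  return matchMedia("(prefers-reduced-motion: reduce)").matches ? "crossfade" : "burst";
}
```

In the webview this would be called as `pickFusionAnimation((q) => window.matchMedia(q))`. A `@media (prefers-reduced-motion: reduce)` rule in index.css could achieve the same result with no script at all.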

Gating and safety (reused, not rebuilt)

The both-engines requirement is enforced physically: with only one engine signed in, the Codex target never glows and the drop is a no-op, so you literally cannot fuse without both. Everything downstream (the write-capable coder gate, the one plain-language approval, rate-limit handling) is unchanged from this PR.

Acceptance criteria

  • Desktop: drag Claude onto Codex fuses; dropping elsewhere is a no-op.
  • Mobile: long-press drag fuses and does not break chip scroll or page swipe.
  • Only one engine signed in: no glow, no fuse.
  • prefers-reduced-motion: crossfade path, no burst.
  • Un-fuse returns to Claude / Codex.
  • No visible "Claudex" anywhere else (no third chip, no PickEngine card).
  • A fused session is stored and badged as claudex exactly as it is today.

Why this is low risk

One component's worth of gesture logic plus CSS, no new dependencies, no core changes, and fully reversible: promoting Claudex to a main feature later is just re-adding the visible chip. The easter egg and the clear test window both come for free from the gate that is already in your PR.

Happy to take this part on top of your branch if that helps.
